Database Content Exploration and Exploratory Analysis of User Queries
Abstract
Content providers, such as enterprises and organizations that publish their content on the Internet, aim to make that content visible and easily accessible to users. The vast amount of data held in databases impedes these efforts, as users often find it challenging to navigate the available data and find the items that best suit their needs. Content providers therefore need to motivate users to explore the available data and to assist them in finding items that interest them. State-of-the-art approaches such as top-k queries are not appropriate for data exploration, as they require users to be aware of the database structure and of the content they are exploring.
In this thesis, we study the problem of enhancing the visibility of database content through exploratory search and analysis. We propose exploratory algorithms that return to the user a small number of results that nevertheless provide a wide overview of the available content. In addition, we present algorithms that identify items that are appealing to users; these items can be used to give users insight into the available items and to motivate them to explore the database. In particular, the main contributions of the thesis are:
• We develop a framework for organizing and summarizing keyword search results based on their textual content and temporal data.
• We introduce a new type of query, the eXploratory Top-k Join (XTJk) query, which creates object combinations that are better suited to user preferences than single objects, and we present algorithms for the efficient processing of XTJk queries.
• We introduce the continuous influential query, which returns objects that are continuously attractive to a large number of users for long periods, and we present algorithms for the efficient retrieval of continuous influential objects.
• We model the diversity of database objects based on user preferences, and we propose efficient algorithms for selecting products that are attractive to a wide range of users with diverse preferences.
• We describe the Best-terms problem, which is the problem of increasing the rank of a spatio-textual object by enhancing its textual description. We show that the problem is NP-hard and present approximate algorithms that retrieve high-quality results.
The proposed approaches have been evaluated in extensive experiments on both synthetic and real datasets, which demonstrate the efficiency of the proposed methods.
Similar resources
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
A Progressive Query Materialization for Interactive Data Exploration
Data analysis is the task of competently digging out data insights even when the user is uncertain, a need that database systems are unable to meet. The user's iterative interactions with the system might be an alternative. Interactive data exploration (IDE) is one such system; it supports exploration by incorporating user intention. IDE is a key ingredient of many discovery applications, such as scientific computing, financial a...
AIDE: An Automated Sample-based Approach for Interactive Data Exploration
In this paper, we argue that database systems be augmented with an automated data exploration service that methodically steers users through the data in a meaningful way. Such an automated system is crucial for deriving insights from complex datasets found in many big data applications such as scientific and healthcare applications as well as for reducing the human effort of data exploration. T...
FutureView: Enhancing Exploratory Image Search
Search algorithms in image retrieval tend to focus on giving the user more and more similar images based on queries that the user has to explicitly formulate. Implicitly, such systems limit the user's exploration of the image space and thus remove the potential for serendipity. As a response, in recent years there has been an increased interest in developing content-based image retrieval systems...
PICASSO: Exploratory Search of Connected Subgraph Substructures in Graph Databases
Recently, exploratory search has received much attention in information retrieval and database fields. This search paradigm assists users who do not have a clear search intent and are unfamiliar with the underlying data space. Specifically, query formulation evolves iteratively as the user becomes more familiar with the content. Despite its growing importance, exploratory search on graph-struct...